Mitigating Data Imbalance Problem in Transformer-Based Intent Detection
Abstract
There are two major problems when deploying a practical intent detection system for a new customer. First, domain-specific data from the customer could be limited and imbalanced. Additionally, although different customers might share the same domain, their categories may differ from each other. Thus, it is difficult to combine the collected datasets into a single larger one. In this paper, we use class weights in the loss computation to alleviate the imbalance problem. The weights are defined as inversely proportional to the class frequency in the training set in order to give more influence to less observed classes. We also employ a two-pass fine-tuning procedure to utilize information from in-domain datasets. Experimental results show that performance is improved significantly when the weighted loss function is used together with the transfer learning procedure. The absolute improvement in percent accuracy is approximately 2% over the transformer-based baseline.
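The class weighting described in the abstract can be illustrated with a minimal sketch. The abstract only states that weights are inversely proportional to class frequency in the training set; the specific normalization used here (n_samples / (n_classes * count)), the function names, and the intent labels are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Assumed normalization: n_samples / (n_classes * class_count), so a
    perfectly balanced dataset yields a weight of 1.0 for every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs  : list of dicts mapping class -> predicted probability
    labels : list of true classes, same length as probs
    weights: dict mapping class -> weight
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical imbalanced training set: 8 frequent vs. 2 rare intents.
train_labels = ["check_balance"] * 8 + ["close_account"] * 2
class_weights = inverse_frequency_weights(train_labels)
# The rare intent receives the larger weight (2.5 vs. 0.625).
print(class_weights)
```

With such weights, an error on a rarely observed intent contributes more to the loss than an error on a frequent one, which is the effect the abstract attributes to the weighted loss function.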
Similar resources
Class Imbalance Problem in Data Mining Review
In the last few years, the classification of data has undergone major changes and evolution. As the application areas of technology increase, the size of data also increases. Classification of data becomes difficult because of the unbounded size and imbalanced nature of data. The class imbalance problem has become one of the greatest issues in data mining. The imbalance problem occurs where one of the two classes havi...
Geometric Mean based Boosting Algorithm to Resolve Data Imbalance Problem
In classification or prediction tasks, the data imbalance problem is frequently observed when most samples belong to one majority class. The data imbalance problem has received a lot of attention in the machine learning community because it is one of the causes that degrade the performance of classifiers or predictors. In this paper, we propose a geometric mean based boosting algorithm (GMBoost) to resolv...
Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction Model in Semiconductor Manufacturing Process
Fault detection prediction for the FAB (wafer fabrication) process in semiconductor manufacturing can improve product quality and reliability, depending on the classification performance. However, faults occur only occasionally in the FAB process, and most outcomes are “pass”. Hence, data imbalance occurs between the pass and fail classes. If data imbalance occurs, prediction models a...
Class Imbalance Problem in Data Mining using Probabilistic Approach
The class imbalance problem arises when one class has far more examples than the other classes. Classical classifiers designed for balanced datasets cannot deal with the class imbalance problem because they pay more attention to the majority class. The main drawback associated with the majority class is the loss of important information. The class imbalance problem is difficult due to the amount ...
Alleviating the Class Imbalance problem in Data Mining
The class imbalance problem in two-class data sets is one of the most important problems. When examples of one class in a training data set vastly outnumber examples of the other class, standard machine learning algorithms tend to be overwhelmed by the majority class and ignore the minority class. There are several algorithms in the literature to alleviate the problem of class imbalance. In this pa...
Journal
Journal title: European Journal of Science and Technology
Year: 2022
ISSN: 2148-2683
DOI: https://doi.org/10.31590/ejosat.1044812